Interview-1

Difference between Hive and Beeline. 

Both the Hive CLI and Beeline can be used to interact with the Hive execution engine, but there are a few differences between them. The primary difference is how each client connects to Hive.

  • The Hive CLI connects directly to the Hive Driver and requires that Hive be installed on the same machine as the client. 
  • Beeline connects to HiveServer2 and does not require Hive libraries to be installed on the same machine as the client.
  • Beeline is a thin client that uses the Hive JDBC driver and executes queries through HiveServer2, which allows multiple concurrent client connections and supports authentication.

Connection Method

Hive: To start the Hive CLI we just need to type the command hive in the shell. It takes us directly to the hive prompt, where we can execute commands.

Beeline: To connect through Beeline we need to use the beeline command along with a connection string. The command takes the JDBC URL of HiveServer2 with -u, the user name with -n, and the password with -p.
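
A concrete Beeline connection might look like the sketch below. The host, database, user name, and password are placeholder assumptions, not values from any real cluster; 10000 is the default HiveServer2 port. The guard only attempts the connection when beeline is actually installed.

```shell
# Assumed values for illustration; replace with your own HiveServer2 details.
HS2_URL="jdbc:hive2://localhost:10000/default"   # JDBC URL: host, port, database
HS2_USER="hiveuser"                              # hypothetical user name
HS2_PASS="changeme"                              # hypothetical password

if command -v beeline >/dev/null 2>&1; then
    # Connect, run a single statement, and exit.
    beeline -u "$HS2_URL" -n "$HS2_USER" -p "$HS2_PASS" -e "SHOW DATABASES;"
else
    echo "beeline not found on PATH; would run: beeline -u $HS2_URL -n $HS2_USER -p <password>"
fi
```

Leaving out -e opens an interactive Beeline prompt instead of running one statement.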

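What are the different modes available in Hive?

Hive can operate in two modes:

  • MapReduce mode, which runs jobs on the Hadoop cluster and suits large data sets.
  • Local mode, which runs jobs in a single process on the local machine and suits small data sets.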
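
As a rough sketch of how a session might opt into local execution, the script below writes two settings to a scratch file and passes it to hive. Both properties exist in Hive and Hadoop respectively, but whether local mode is appropriate depends on the cluster, so treat this as an illustration rather than a recommended configuration.

```shell
# Write the session settings to a scratch file (path chosen by mktemp).
MODE_HQL="$(mktemp)"
cat > "$MODE_HQL" <<'EOF'
-- Let Hive decide per query whether the input is small enough to run locally.
SET hive.exec.mode.local.auto=true;
-- Alternatively, force every job in this session to run in a local process.
SET mapreduce.framework.name=local;
-- Queries issued after this point in the session use the chosen mode.
EOF

if command -v hive >/dev/null 2>&1; then
    hive -f "$MODE_HQL"
else
    echo "hive not found on PATH; settings written to $MODE_HQL"
fi
```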

Can Hive Queries Be Executed From Script Files? How?
Yes, using the source command from the Hive prompt:
hive> source /path/to/file/file_with_query.hql;
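
A script file can also be run without opening the prompt, using the -f option of the hive command. The sketch below creates a small query file (its contents are only an example) and runs it when hive is available.

```shell
# Create a throwaway query file; the statements are illustrative only.
QUERY_FILE="$(mktemp)"
cat > "$QUERY_FILE" <<'EOF'
SHOW DATABASES;
SHOW TABLES;
EOF

if command -v hive >/dev/null 2>&1; then
    # -f runs every statement in the file and then exits.
    hive -f "$QUERY_FILE"
else
    echo "hive not found on PATH; would run: hive -f $QUERY_FILE"
fi
```

Beeline accepts -f in the same way, in addition to the -u, -n and -p connection options.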

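Incremental Load In Hive Table

Connect to HiveServer2 through Beeline:

beeline -u <url> -n <username> -p <password>

Then create a view that merges the base table with the incremental table and keeps, for each id, only the row with the latest modified_date: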

CREATE VIEW reconcile_view AS
SELECT t1.* FROM
    -- t1: every row from both tables
    (SELECT * FROM base_table
     UNION ALL
     SELECT * FROM incremental_table) t1
JOIN
    -- t2: the latest modified_date for each id
    (SELECT id, max(modified_date) max_modified FROM
        (SELECT * FROM base_table
         UNION ALL
         SELECT * FROM incremental_table) t3  -- derived tables need an alias in older Hive versions
     GROUP BY id) t2
ON t1.id = t2.id AND t1.modified_date = t2.max_modified;
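
To see what the view returns, it can help to try it on a few rows. The sketch below is a self-contained demo under stated assumptions: the database name reconcile_demo, the three-column schema, and the sample rows are all hypothetical, and INSERT ... VALUES needs Hive 0.14 or later. The script rebuilds the scratch database on every run.

```shell
DEMO_HQL="$(mktemp)"
cat > "$DEMO_HQL" <<'EOF'
-- Scratch database so the demo does not touch real tables (name is hypothetical).
DROP DATABASE IF EXISTS reconcile_demo CASCADE;
CREATE DATABASE reconcile_demo;
USE reconcile_demo;

CREATE TABLE base_table (id INT, name STRING, modified_date STRING);
CREATE TABLE incremental_table (id INT, name STRING, modified_date STRING);

INSERT INTO TABLE base_table VALUES
    (1, 'alpha', '2024-01-01'),
    (2, 'beta',  '2024-01-01');
INSERT INTO TABLE incremental_table VALUES
    (1, 'alpha-updated', '2024-02-01'),
    (3, 'gamma',         '2024-02-01');

CREATE VIEW reconcile_view AS
SELECT t1.* FROM
    (SELECT * FROM base_table
     UNION ALL
     SELECT * FROM incremental_table) t1
JOIN
    (SELECT id, max(modified_date) max_modified FROM
        (SELECT * FROM base_table
         UNION ALL
         SELECT * FROM incremental_table) t3
     GROUP BY id) t2
ON t1.id = t2.id AND t1.modified_date = t2.max_modified;

-- One row per id is expected: the updated row for id 1, the unchanged base
-- row for id 2, and the newly added row for id 3.
SELECT * FROM reconcile_view;
EOF

if command -v hive >/dev/null 2>&1; then
    hive -f "$DEMO_HQL"
else
    echo "hive not found on PATH; demo script written to $DEMO_HQL"
fi
```

Because modified_date is compared as a string here, the demo relies on dates being written in yyyy-MM-dd form so that string order matches date order.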


What Is The Significance Of The Line set hive.mapred.mode=strict;
It sets Hive to strict mode, in which queries on partitioned tables cannot run without a WHERE clause that filters on a partition column. This prevents very large jobs from running for a long time. Strict mode also rejects ORDER BY without LIMIT and Cartesian-product joins.
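
The effect of strict mode can be sketched with a hypothetical table named sales that is partitioned by sale_date (both names are assumptions for illustration). The rejected query is left as a comment so the script does not stop at the first error.

```shell
STRICT_HQL="$(mktemp)"
cat > "$STRICT_HQL" <<'EOF'
SET hive.mapred.mode=strict;

-- Rejected in strict mode: the WHERE clause has no filter on the partition column.
-- SELECT * FROM sales WHERE amount > 100;

-- Accepted: the filter on sale_date restricts the scan to one partition.
SELECT * FROM sales WHERE sale_date = '2024-01-01' AND amount > 100;
EOF

if command -v hive >/dev/null 2>&1; then
    hive -f "$STRICT_HQL"
else
    echo "hive not found on PATH; example written to $STRICT_HQL"
fi
```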

How Do You Check If A Particular Partition Exists?

SHOW PARTITIONS table_name PARTITION(partitioned_column='partition_value');
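
In a shell script, the same check can be wrapped in a small helper. The function below is a sketch, not a standard utility: partition_exists is a made-up name, and the table sales with partition column sale_date is hypothetical. It treats an empty result or a failed command as "partition missing".

```shell
# Returns success (exit status 0) when Hive lists the requested partition.
# Arguments: table name, partition column, partition value.
partition_exists() {
    # -S suppresses informational messages; -e runs the quoted statement.
    output="$(hive -S -e "SHOW PARTITIONS $1 PARTITION($2='$3');")" || return 1
    # An existing partition is printed as column=value; no match means it is absent.
    printf '%s\n' "$output" | grep -qF "$2=$3"
}

if command -v hive >/dev/null 2>&1; then
    if partition_exists sales sale_date 2024-01-01; then
        echo "partition exists"
    else
        echo "partition missing"
    fi
else
    echo "hive not found on PATH; partition_exists was defined but not run"
fi
```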

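What Is The Significance Of The "IF EXISTS" Clause While Dropping A Table?
When we issue the command DROP TABLE IF EXISTS table_name, Hive does not throw an error if the table being dropped does not exist in the first place. Without IF EXISTS, the DROP statement returns an error for a missing table, unless hive.exec.drop.ignorenonexistent is set to true.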
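
A common use of this clause is a setup script that can be run repeatedly. The sketch below assumes a hypothetical table named staging_orders; the first statement succeeds whether or not the table already exists.

```shell
DROP_HQL="$(mktemp)"
cat > "$DROP_HQL" <<'EOF'
-- Safe to run repeatedly: no error is raised when the table is absent.
DROP TABLE IF EXISTS staging_orders;
CREATE TABLE staging_orders (id INT, amount DOUBLE);
EOF

if command -v hive >/dev/null 2>&1; then
    hive -f "$DROP_HQL"
else
    echo "hive not found on PATH; example written to $DROP_HQL"
fi
```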

https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_data-access/content/incrementally-updating-hive-table-with-sqoop-and-ext-table.html 

